MDHIM: A Parallel Key/Value Framework for HPC
Authors
Abstract
The long-expected convergence of High Performance Computing and Big Data Analytics is upon us. Unfortunately, the computing environments created for each workload are not necessarily conducive to the other. In this paper, we evaluate the ability of traditional high performance computing architectures to run big data analytics. We discover and describe limitations that prevent the seamless use of existing big data analytics tools and software. Specifically, we evaluate the effectiveness of distributed key-value stores for manipulating large data sets across tightly coupled parallel supercomputers. Although existing distributed key-value stores have proven highly effective in cloud environments, we find their performance on HPC clusters to be degraded. Accordingly, we have built an HPC-specific key-value store called the Multi-Dimensional Hierarchical Indexing Middleware (MDHIM). Using standard big data benchmarks, we find that MDHIM performance more than triples that of Cassandra on HPC systems.
Similar resources
Programming with the HPC++ Parallel Standard Template Library
We present an overview of the HPC++ Parallel Standard Template Library (PSTL), a parallel version of the C++ Standard Template Library (STL). The PSTL is part of HPC++, a C++ library and language extension framework being developed by the HPC++ consortium as a standard model for portable parallel programming in C++. The PSTL includes distributed versions of the seven STL containers (vector, lis...
Interpretive Performance Prediction for Parallel Application Development
Application software development for High-Performance Parallel Computing (HPC) is a non-trivial process; its complexity can be primarily attributed to the increased degrees of freedom that have to be resolved and tuned in such an environment. Performance prediction tools enable a developer to evaluate available design alternatives and can assist in HPC application software development. In this ...
POSITION PAPER – pFLogger: The parallel Fortran logging framework for HPC applications
In the context of high performance computing (HPC), software investments in support of text-based diagnostics, which monitor a running application, are typically limited compared to those for other types of IO. Examples of such diagnostics include reiteration of configuration parameters, progress indicators, simple metrics (e.g., mass conservation, convergence of solvers, etc.), and timers. To ...
A convergence of key-value storage systems from clouds to supercomputers
This paper presents a convergence of distributed Key-Value storage systems in clouds and supercomputers. It specifically presents ZHT, a zero-hop distributed key-value store system, which has been tuned for the requirements of high-end computing systems. ZHT aims to be a building block for future distributed systems, such as parallel and distributed file systems, distributed job management syst...
Software Issues in High-performance Computing and a Framework for the Development of HPC Applications
We identify the following key problems faced by HPC software: (1) the large gap between HPC design and implementation models in application development, (2) achieving high performance for a single application on different HPC platforms, and (3) accommodating constant changes in both problem specification and target architecture as computational methods and architectures evolve. To attack these pr...
Publication date: 2015